Using argumentation to extract key sentences from biomedical abstracts

نویسندگان

  • Patrick Ruch
  • Célia Boyer
  • Christine Chichester
  • Imad Tbahriti
  • Antoine Geissbühler
  • Paul Fabry
  • Julien Gobeill
  • Violaine Pillet
  • Dietrich Rebholz-Schuhmann
  • Christian Lovis
  • Anne-Lise Veuthey
چکیده

PROBLEM key word assignment has been largely used in MEDLINE to provide an indicative "gist" of the content of articles and to help retrieving biomedical articles. Abstracts are also used for this purpose. However with usually more than 300 words, MEDLINE abstracts can still be regarded as long documents; therefore we design a system to select a unique key sentence. This key sentence must be indicative of the article's content and we assume that abstract's conclusions are good candidates. We design and assess the performance of an automatic key sentence selector, which classifies sentences into four argumentative moves: PURPOSE, METHODS, RESULTS and CONCLUSION METHODS we rely on Bayesian classifiers trained on automatically acquired data. Features representation, selection and weighting are reported and classification effectiveness is evaluated on the four classes using confusion matrices. We also explore the use of simple heuristics to take the position of sentences into account. Recall, precision and F-scores are computed for the CONCLUSION class. For the CONCLUSION class, the F-score reaches 84%. Automatic argumentative classification using Bayesian learners is feasible on MEDLINE abstracts and should help user navigation in such repositories.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sentence retrieval for abstracts of randomized controlled trials

BACKGROUND The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. But this is becoming increasingly difficult with the growing numbers of published articles. There is a clear need for better tools to improve clinician's ability to search the primary literature. Randomized clinical trials (RCTs) are the most reliable so...

متن کامل

Relevance of Cluster size in MMR based Summarizer : A Report

ion of documents by humans is complex to model as is any other information processing by humans. The abstracts differ from person to person, and usually vary in the style, language and detail. The process of abstraction is complex to be formulated mathematically or logically [14]. In the last decade some systems have been developed that generate abstractions using the latest natural language pr...

متن کامل

Functional gene clustering via gene annotation sentences, MeSH and GO keywords from biomedical literature

Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to...

متن کامل

مقایسه روش‌های مختلف یادگیری ماشین در خلاصه‌سازی استخراجی گفتار به گفتار فارسی بدون استفاده از رونوشت

In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of Speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files easier and in a less costly manner. In this paper, a new method for speech summarization without using automatic speech recognitio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • International journal of medical informatics

دوره 76 2-3  شماره 

صفحات  -

تاریخ انتشار 2007